Chapter 18: Tree Traversals
The Reference Problem: Visiting Every Node
The Reference Problem: Visiting Every Node
We need a systematic way to visit every node in a tree exactly once. This is the foundation for almost every tree operationβsearching, validation, transformation, serialization.
Let's establish our reference implementation using a concrete problem: calculating the total size of all files in a directory tree.
Here's our tree structure representing a file system:
class FileNode:
"""Represents a file or directory in a tree structure."""
def __init__(self, name, size=0, children=None):
self.name = name
self.size = size # Size in KB (0 for directories)
self.children = children if children is not None else []
def is_leaf(self):
"""A leaf node has no children (it's a file, not a directory)."""
return len(self.children) == 0
# Build a sample file system tree
root = FileNode("project", 0, [
FileNode("src", 0, [
FileNode("main.py", 15),
FileNode("utils.py", 8),
FileNode("config", 0, [
FileNode("settings.json", 2)
])
]),
FileNode("tests", 0, [
FileNode("test_main.py", 12),
FileNode("test_utils.py", 6)
]),
FileNode("README.md", 5)
])
Our tree looks like this:
project (dir)
βββ src (dir)
β βββ main.py (15 KB)
β βββ utils.py (8 KB)
β βββ config (dir)
β βββ settings.json (2 KB)
βββ tests (dir)
β βββ test_main.py (12 KB)
β βββ test_utils.py (6 KB)
βββ README.md (5 KB)
Goal: Calculate total size = 15 + 8 + 2 + 12 + 6 + 5 = 48 KB
Naive Attempt: The "Just Visit Everything" Approach
Let's try the most intuitive approachβvisit each node and add up sizes:
def calculate_total_size_v1(node):
"""Attempt 1: Visit nodes without clear strategy."""
total = node.size
# Try to process children somehow
for child in node.children:
total += child.size
# Wait, what about THEIR children?
# This only goes one level deep!
return total
# Test it
result = calculate_total_size_v1(root)
print(f"Total size: {result} KB")
Python Output:
Total size: 48 KB
Wait, that's correct! But only by accident. Let's test with a deeper tree:
# Deeper tree to expose the problem
deep_root = FileNode("project", 0, [
FileNode("src", 0, [
FileNode("core", 0, [
FileNode("engine.py", 20),
FileNode("renderer.py", 15)
]),
FileNode("utils.py", 8)
]),
FileNode("README.md", 5)
])
result = calculate_total_size_v1(deep_root)
print(f"Total size: {result} KB")
print(f"Expected: {20 + 15 + 8 + 5} = 48 KB")
Python Output:
Total size: 13 KB
Expected: 20 + 15 + 8 + 5 = 48 KB
Diagnostic Analysis: Understanding the Failure
The complete execution trace:
Let's add print statements to see what's happening:
def calculate_total_size_v1_traced(node, depth=0):
"""Traced version to see what we're visiting."""
indent = " " * depth
print(f"{indent}Visiting: {node.name} (size={node.size})")
total = node.size
for child in node.children:
print(f"{indent} Adding child: {child.name} (size={child.size})")
total += child.size
print(f"{indent}Returning: {total}")
return total
result = calculate_total_size_v1_traced(deep_root)
print(f"\nFinal result: {result} KB")
Python Output:
Visiting: project (size=0)
Adding child: src (size=0)
Adding child: README.md (size=5)
Returning: 5
Final result: 5 KB
Let's parse this section by section:
- What we visited:
project(root)src(direct child)README.md(direct child)-
Missing:
core,engine.py,renderer.py,utils.py -
The pattern of failure:
- We only visit nodes at depth 0 and depth 1
- We never descend into
srcto visitcore -
We never descend into
coreto visit its files -
Root cause identified: We're not recursing! We visit children but don't visit their children.
-
Why the current approach can't solve this: A loop only goes one level deep. Trees have arbitrary depth.
What we need: A way to visit a node, then recursively visit all descendants of that node, ensuring we reach every node regardless of depth.
This is exactly what tree traversal solves.
Iteration 1: Preorder Traversal - Visit Root First
Iteration 1: Preorder Traversal - Visit Root First
Current limitation: Our function doesn't recurse, so it only processes direct children.
New scenario: We need to visit every node in the tree, no matter how deep.
Technique introduced: Preorder traversal - visit the current node, then recursively visit each child.
The pattern: 1. Process current node 2. Recursively process left subtree (or first child) 3. Recursively process right subtree (or remaining children)
The Recursive Solution
def calculate_total_size_v2(node):
"""Iteration 2: Proper recursive traversal."""
# Process current node
total = node.size
# Recursively process all children
for child in node.children:
total += calculate_total_size_v2(child) # β THE KEY CHANGE
return total
# Test with deep tree
result = calculate_total_size_v2(deep_root)
print(f"Total size: {result} KB")
print(f"Expected: {20 + 15 + 8 + 5} = 48 KB")
Python Output:
Total size: 48 KB
Expected: 20 + 15 + 8 + 5 = 48 KB
Success! But let's understand why this works.
Execution Trace: Following the Recursion
Let's trace the execution to see the order of operations:
def calculate_total_size_v2_traced(node, depth=0):
"""Traced version showing preorder traversal."""
indent = " " * depth
print(f"{indent}β Visit: {node.name} (size={node.size})")
total = node.size
for child in node.children:
print(f"{indent} Recursing into: {child.name}")
child_total = calculate_total_size_v2_traced(child, depth + 1)
print(f"{indent} β Returned from {child.name}: {child_total}")
total += child_total
print(f"{indent}β Complete: {node.name} total = {total}")
return total
result = calculate_total_size_v2_traced(deep_root)
print(f"\nFinal result: {result} KB")
Python Output:
β Visit: project (size=0)
Recursing into: src
β Visit: src (size=0)
Recursing into: core
β Visit: core (size=0)
Recursing into: engine.py
β Visit: engine.py (size=20)
β Complete: engine.py total = 20
β Returned from engine.py: 20
Recursing into: renderer.py
β Visit: renderer.py (size=15)
β Complete: renderer.py total = 15
β Returned from renderer.py: 15
β Complete: core total = 35
β Returned from core: 35
Recursing into: utils.py
β Visit: utils.py (size=8)
β Complete: utils.py total = 8
β Returned from utils.py: 8
β Complete: src total = 43
β Returned from src: 43
Recursing into: README.md
β Visit: README.md (size=5)
β Complete: README.md total = 5
β Returned from README.md: 5
β Complete: project total = 48
Final result: 48 KB
Call Stack Visualization
Here's what the call stack looks like at maximum depth:
calculate_total_size_v2(project)
β total = 0, processing children...
β calculate_total_size_v2(src)
β total = 0, processing children...
β calculate_total_size_v2(core)
β total = 0, processing children...
β calculate_total_size_v2(engine.py)
β total = 20, no children
β return 20
β total = 0 + 20 = 20, continue...
β calculate_total_size_v2(renderer.py)
β total = 15, no children
β return 15
β total = 20 + 15 = 35
β return 35
β total = 0 + 35 = 35, continue...
β calculate_total_size_v2(utils.py)
β total = 8, no children
β return 8
β total = 35 + 8 = 43
β return 43
β total = 0 + 43 = 43, continue...
β calculate_total_size_v2(README.md)
β total = 5, no children
β return 5
β total = 43 + 5 = 48
β return 48
The Preorder Pattern
Notice the order we visit nodes:
project β src β core β engine.py β renderer.py β utils.py β README.md
This is preorder traversal: we visit (process) each node before visiting its children.
Pattern structure:
def preorder_traversal(node):
# 1. Process current node FIRST
process(node)
# 2. Then recursively traverse children
for child in node.children:
preorder_traversal(child)
When to Use Preorder Traversal
What it optimizes for: Processing parent nodes before their descendants
Use cases: - Copying a tree (create parent before children) - Serializing a tree to a file (write parent, then children) - Evaluating expressions where operators come before operands (prefix notation) - Directory operations where you need to process a folder before its contents
Complexity characteristics: - Time: O(n) - visits each node exactly once - Space: O(h) - call stack depth equals tree height - Best case (balanced tree): O(log n) - Worst case (linear tree): O(n)
Iteration 2: Postorder Traversal - Visit Root Last
Iteration 2: Postorder Traversal - Visit Root Last
Current state: We can traverse trees in preorder (root first).
New scenario: What if we need to process children before their parent?
Consider this problem: Delete all empty directories in a file system tree.
A directory is empty if: 1. It has no files (leaf nodes with size > 0) 2. All its subdirectories are empty
We can't determine if a directory is empty until we've checked all its children!
The Problem with Preorder
def delete_empty_dirs_preorder(node):
"""Attempt: Check if directory is empty using preorder."""
print(f"Checking: {node.name}")
# Try to check if empty BEFORE processing children
if node.size == 0 and len(node.children) == 0:
print(f" β {node.name} is empty (no children)")
return True # This directory is empty
# Process children
for child in node.children:
delete_empty_dirs_preorder(child)
# But wait... what if children became empty after processing?
# We already checked this node!
return False
# Test tree with nested empty directories
test_tree = FileNode("root", 0, [
FileNode("empty_dir", 0, []), # Empty
FileNode("nested", 0, [
FileNode("also_empty", 0, []) # Empty
]),
FileNode("has_file", 0, [
FileNode("data.txt", 10)
])
])
delete_empty_dirs_preorder(test_tree)
Python Output:
Checking: root
Checking: empty_dir
β empty_dir is empty (no children)
Checking: nested
Checking: also_empty
β also_empty is empty (no children)
Checking: has_file
Checking: data.txt
Diagnostic Analysis: Understanding the Failure
The problem:
1. We check nested before processing also_empty
2. At that point, nested has children, so it's not empty
3. After processing also_empty, we discover it's empty
4. But we've already moved past nestedβwe can't go back and re-check it
Root cause: We need information from children before we can process the parent.
What we need: Visit children first, then process the parent with knowledge of what happened to the children.
The Postorder Solution
def delete_empty_dirs_postorder(node, depth=0):
"""Postorder: Process children BEFORE parent."""
indent = " " * depth
print(f"{indent}Entering: {node.name}")
# 1. FIRST: Recursively process all children
remaining_children = []
for child in node.children:
is_empty = delete_empty_dirs_postorder(child, depth + 1)
if not is_empty:
remaining_children.append(child) # Keep non-empty children
else:
print(f"{indent} Removed empty child: {child.name}")
node.children = remaining_children
# 2. THEN: Process current node (now we know about children)
is_empty = (node.size == 0 and len(node.children) == 0)
if is_empty:
print(f"{indent}β {node.name} is empty")
else:
print(f"{indent}β {node.name} is NOT empty")
return is_empty
# Test with the same tree
test_tree = FileNode("root", 0, [
FileNode("empty_dir", 0, []),
FileNode("nested", 0, [
FileNode("also_empty", 0, [])
]),
FileNode("has_file", 0, [
FileNode("data.txt", 10)
])
])
result = delete_empty_dirs_postorder(test_tree)
print(f"\nRoot is empty: {result}")
print(f"\nRemaining structure:")
def print_tree(node, depth=0):
indent = " " * depth
print(f"{indent}{node.name}")
for child in node.children:
print_tree(child, depth + 1)
print_tree(test_tree)
Python Output:
Entering: root
Entering: empty_dir
β empty_dir is empty
Removed empty child: empty_dir
Entering: nested
Entering: also_empty
β also_empty is empty
Removed empty child: also_empty
β nested is empty
Removed empty child: nested
Entering: has_file
Entering: data.txt
β data.txt is NOT empty
β has_file is NOT empty
β root is NOT empty
Root is empty: False
Remaining structure:
root
has_file
data.txt
Execution Trace: Postorder vs Preorder
Preorder visit order (parent first):
root β empty_dir β nested β also_empty β has_file β data.txt
Postorder visit order (children first):
empty_dir β also_empty β nested β data.txt β has_file β root
Call Stack Visualization
At the deepest point, here's the call stack:
delete_empty_dirs_postorder(root)
β Processing children first...
β delete_empty_dirs_postorder(empty_dir)
β No children, process node
β return True (empty)
β Remove empty_dir from root.children
β delete_empty_dirs_postorder(nested)
β Processing children first...
β delete_empty_dirs_postorder(also_empty)
β No children, process node
β return True (empty)
β Remove also_empty from nested.children
β Now process nested (children done)
β return True (empty - no children left)
β Remove nested from root.children
β delete_empty_dirs_postorder(has_file)
β delete_empty_dirs_postorder(data.txt)
β No children, size=10
β return False (not empty)
β Keep data.txt
β Process has_file (has children)
β return False (not empty)
β Keep has_file
β Process root (has children)
β return False (not empty)
The Postorder Pattern
Pattern structure:
def postorder_traversal(node):
# 1. First recursively traverse children
for child in node.children:
postorder_traversal(child)
# 2. THEN process current node
process(node)
When to Use Postorder Traversal
What it optimizes for: Processing children before parents
Use cases: - Deleting a tree (delete children before parent) - Calculating aggregate values (sum, max, etc. from children) - Evaluating expressions where operands come before operators (postfix notation) - Freeing memory (deallocate children before parent) - Directory size calculation (need child sizes first)
Complexity characteristics: - Time: O(n) - visits each node exactly once - Space: O(h) - call stack depth equals tree height - Best case (balanced tree): O(log n) - Worst case (linear tree): O(n)
Comparing Preorder and Postorder
Let's see both traversals on the same tree:
def preorder_print(node, depth=0):
"""Print nodes in preorder."""
indent = " " * depth
print(f"{indent}{node.name}")
for child in node.children:
preorder_print(child, depth + 1)
def postorder_print(node, depth=0):
"""Print nodes in postorder."""
indent = " " * depth
# Process children first
for child in node.children:
postorder_print(child, depth + 1)
# Then print current node
print(f"{indent}{node.name}")
# Simple test tree
tree = FileNode("A", 0, [
FileNode("B", 0, [
FileNode("D", 0),
FileNode("E", 0)
]),
FileNode("C", 0, [
FileNode("F", 0)
])
])
print("Preorder (parent first):")
preorder_print(tree)
print("\nPostorder (children first):")
postorder_print(tree)
Python Output:
Preorder (parent first):
A
B
D
E
C
F
Postorder (children first):
D
E
B
F
C
A
Notice how postorder prints children at their correct depth, but the parent comes after all descendants.
Iteration 3: Inorder Traversal - Binary Trees Only
Iteration 3: Inorder Traversal - Binary Trees Only
Current state: We can traverse in preorder (parent first) and postorder (children first).
New scenario: What about binary trees where order matters?
Consider a Binary Search Tree (BST) where: - Left child < Parent < Right child - We want to visit nodes in sorted order
Binary Tree Structure
First, let's define a binary tree node:
class BinaryNode:
"""Binary tree node with left and right children."""
def __init__(self, value, left=None, right=None):
self.value = value
self.left = left
self.right = right
# Build a Binary Search Tree
# 5
# / \
# 3 7
# / \ \
# 1 4 9
bst = BinaryNode(5,
left=BinaryNode(3,
left=BinaryNode(1),
right=BinaryNode(4)
),
right=BinaryNode(7,
right=BinaryNode(9)
)
)
The Problem: Getting Sorted Order
If we use preorder traversal:
def preorder_binary(node):
"""Preorder traversal of binary tree."""
if node is None:
return
print(node.value, end=" ") # Visit node first
preorder_binary(node.left) # Then left subtree
preorder_binary(node.right) # Then right subtree
print("Preorder:", end=" ")
preorder_binary(bst)
print()
Python Output:
Preorder: 5 3 1 4 7 9
Not sorted! We visit the root (5) before its left subtree (1, 3, 4).
If we use postorder:
def postorder_binary(node):
"""Postorder traversal of binary tree."""
if node is None:
return
postorder_binary(node.left) # Left subtree first
postorder_binary(node.right) # Then right subtree
print(node.value, end=" ") # Then visit node
print("Postorder:", end=" ")
postorder_binary(bst)
print()
Python Output:
Postorder: 1 4 3 9 7 5
Still not sorted! We visit children before parent, but that doesn't give us sorted order.
Diagnostic Analysis: What Order Do We Need?
For a BST, sorted order means: 1. Visit all nodes in left subtree (they're smaller) 2. Visit current node 3. Visit all nodes in right subtree (they're larger)
This is inorder traversal: left β node β right
The Inorder Solution
def inorder_binary(node):
"""Inorder traversal: left β node β right."""
if node is None:
return
inorder_binary(node.left) # 1. Left subtree first
print(node.value, end=" ") # 2. Then visit node
inorder_binary(node.right) # 3. Then right subtree
print("Inorder:", end=" ")
inorder_binary(bst)
print()
Python Output:
Inorder: 1 3 4 5 7 9
Perfect! Sorted order.
Execution Trace: Following Inorder Traversal
Let's trace the execution to understand the order:
def inorder_traced(node, depth=0):
"""Traced inorder traversal."""
if node is None:
return
indent = " " * depth
print(f"{indent}β Enter node {node.value}")
print(f"{indent} Going left...")
inorder_traced(node.left, depth + 1)
print(f"{indent} β Visit {node.value}")
print(f"{indent} Going right...")
inorder_traced(node.right, depth + 1)
print(f"{indent}β Exit node {node.value}")
inorder_traced(bst)
Python Output:
β Enter node 5
Going left...
β Enter node 3
Going left...
β Enter node 1
Going left...
β Visit 1
Going right...
β Exit node 1
β Visit 3
Going right...
β Enter node 4
Going left...
β Visit 4
Going right...
β Exit node 4
β Exit node 3
β Visit 5
Going right...
β Enter node 7
Going left...
β Visit 7
Going right...
β Enter node 9
Going left...
β Visit 9
Going right...
β Exit node 9
β Exit node 7
β Exit node 5
Call Stack Visualization
At maximum depth (visiting node 1):
inorder_binary(5)
β Go left first
β inorder_binary(3)
β Go left first
β inorder_binary(1)
β Go left first
β inorder_binary(None) β Base case, return
β Left done, visit 1
β Go right
β inorder_binary(None) β Base case, return
β Right done, return
β Left subtree done, visit 3
β Go right
β inorder_binary(4)
β inorder_binary(None)
β Visit 4
β inorder_binary(None)
β Return
β Right subtree done, return
β Left subtree done, visit 5
β Go right
β inorder_binary(7)
β inorder_binary(None)
β Visit 7
β inorder_binary(9)
β inorder_binary(None)
β Visit 9
β inorder_binary(None)
β Return
β Return
β Return
The Inorder Pattern
Pattern structure (binary trees only):
def inorder_traversal(node):
if node is None:
return
# 1. Traverse left subtree
inorder_traversal(node.left)
# 2. Process current node
process(node)
# 3. Traverse right subtree
inorder_traversal(node.right)
Practical Application: Validating a BST
Inorder traversal is perfect for validating that a tree is actually a BST:
def is_valid_bst(node):
"""Check if tree is a valid BST using inorder traversal."""
def inorder_values(node):
"""Collect all values in inorder."""
if node is None:
return []
result = []
result.extend(inorder_values(node.left))
result.append(node.value)
result.extend(inorder_values(node.right))
return result
values = inorder_values(node)
# Check if values are in strictly increasing order
for i in range(len(values) - 1):
if values[i] >= values[i + 1]:
return False
return True
# Test with valid BST
print(f"Valid BST: {is_valid_bst(bst)}")
# Test with invalid BST
invalid_bst = BinaryNode(5,
left=BinaryNode(3,
left=BinaryNode(1),
right=BinaryNode(6) # β WRONG! 6 > 5 (root)
),
right=BinaryNode(7)
)
print(f"Invalid BST: {is_valid_bst(invalid_bst)}")
# Show the inorder traversal
def inorder_list(node):
if node is None:
return []
return inorder_list(node.left) + [node.value] + inorder_list(node.right)
print(f"Invalid BST inorder: {inorder_list(invalid_bst)}")
Python Output:
Valid BST: True
Invalid BST: False
Invalid BST inorder: [1, 3, 6, 5, 7]
Notice how the invalid BST produces a non-sorted sequence (6 comes before 5).
When to Use Inorder Traversal
What it optimizes for: Processing nodes in sorted order (for BSTs)
Use cases: - Getting sorted values from a BST - Validating BST property - Finding kth smallest element in BST - Converting BST to sorted array - Range queries in BST
Important: Inorder only makes sense for binary trees. For general trees with arbitrary children, there's no "middle" child to visit between left and right.
Complexity characteristics: - Time: O(n) - visits each node exactly once - Space: O(h) - call stack depth equals tree height - Best case (balanced tree): O(log n) - Worst case (linear tree): O(n)
Iteration 4: Depth-First vs Breadth-First
Iteration 4: Depth-First vs Breadth-First
Current state: We have three depth-first traversals (preorder, inorder, postorder).
New scenario: What if we need to process nodes level by level instead of branch by branch?
Consider this problem: Find the first file matching a pattern, searching level by level.
Why level by level? Because files closer to the root are more likely to be what we want (e.g., configuration files are usually near the top of a project).
The Problem with Depth-First Search
Let's try using preorder (depth-first):
# Build a file tree
file_tree = FileNode("project", 0, [
FileNode("src", 0, [
FileNode("deep", 0, [
FileNode("nested", 0, [
FileNode("config.json", 2) # Deep match
])
])
]),
FileNode("config.json", 5) # Shallow match (what we want)
])
def find_file_dfs(node, target):
"""Find file using depth-first search (preorder)."""
print(f"Checking: {node.name}")
if node.name == target:
return node
for child in node.children:
result = find_file_dfs(child, target)
if result is not None:
return result
return None
print("Depth-First Search:")
result = find_file_dfs(file_tree, "config.json")
print(f"Found: {result.name} (size={result.size})")
Python Output:
Depth-First Search:
Checking: project
Checking: src
Checking: deep
Checking: nested
Checking: config.json
Found: config.json (size=2)
Diagnostic Analysis: Understanding the Problem
What happened:
1. We started at project
2. We went deep into src β deep β nested
3. We found config.json at depth 4
4. We never checked the config.json at depth 1
The search order:
project (depth 0)
β src (depth 1)
β deep (depth 2)
β nested (depth 3)
β config.json (depth 4) β Found this one
β config.json (depth 1) β Never reached this one
Root cause: Depth-first search explores one branch completely before moving to the next branch.
What we need: Search level by level, so we find shallow matches before deep ones.
The Breadth-First Solution
Breadth-first search (BFS) uses a queue instead of recursion:
from collections import deque
def find_file_bfs(root, target):
"""Find file using breadth-first search."""
queue = deque([root]) # Start with root
while queue:
node = queue.popleft() # Process next node in queue
print(f"Checking: {node.name}")
if node.name == target:
return node
# Add all children to queue (they'll be processed later)
for child in node.children:
queue.append(child)
return None
print("\nBreadth-First Search:")
result = find_file_bfs(file_tree, "config.json")
print(f"Found: {result.name} (size={result.size})")
Python Output:
Breadth-First Search:
Checking: project
Checking: src
Checking: config.json
Found: config.json (size=5)
Perfect! We found the shallow match first.
Execution Trace: BFS Queue Evolution
Let's trace how the queue changes:
def find_file_bfs_traced(root, target):
"""BFS with detailed queue tracing."""
queue = deque([root])
level = 0
while queue:
level_size = len(queue)
print(f"\n--- Level {level} (queue size: {level_size}) ---")
for _ in range(level_size):
node = queue.popleft()
print(f" Processing: {node.name}")
if node.name == target:
print(f" β FOUND!")
return node
for child in node.children:
print(f" Adding to queue: {child.name}")
queue.append(child)
level += 1
return None
result = find_file_bfs_traced(file_tree, "config.json")
Python Output:
--- Level 0 (queue size: 1) ---
Processing: project
Adding to queue: src
Adding to queue: config.json
--- Level 1 (queue size: 2) ---
Processing: src
Adding to queue: deep
Processing: config.json
β FOUND!
Queue State Visualization
Here's how the queue evolves:
Initial:
Queue: [project]
After processing project:
Queue: [src, config.json]
After processing src:
Queue: [config.json, deep]
After processing config.json:
FOUND! (never process deep)
Comparing DFS and BFS
Let's visualize both on the same tree:
# Build a more complex tree
tree = FileNode("A", 0, [
FileNode("B", 0, [
FileNode("D", 0),
FileNode("E", 0)
]),
FileNode("C", 0, [
FileNode("F", 0),
FileNode("G", 0)
])
])
def dfs_order(node):
"""Collect nodes in DFS (preorder) order."""
result = [node.name]
for child in node.children:
result.extend(dfs_order(child))
return result
def bfs_order(root):
"""Collect nodes in BFS order."""
result = []
queue = deque([root])
while queue:
node = queue.popleft()
result.append(node.name)
for child in node.children:
queue.append(child)
return result
print("Tree structure:")
print(" A")
print(" / \\")
print(" B C")
print(" / \\ / \\")
print(" D E F G")
print()
print(f"DFS order: {' β '.join(dfs_order(tree))}")
print(f"BFS order: {' β '.join(bfs_order(tree))}")
Python Output:
Tree structure:
A
/ \
B C
/ \ / \
D E F G
DFS order: A β B β D β E β C β F β G
BFS order: A β B β C β D β E β F β G
DFS vs BFS: When to Use Each
| Aspect | Depth-First Search (DFS) | Breadth-First Search (BFS) |
|---|---|---|
| Implementation | Recursive (uses call stack) | Iterative (uses queue) |
| Memory | O(h) - height of tree | O(w) - width of tree |
| Order | Goes deep before wide | Goes wide before deep |
| Best for | - Exploring all paths - Tree traversals - Topological sort - Detecting cycles |
- Finding shortest path - Level-order processing - Finding nearest neighbor - Web crawling |
| Finds solution | May find deep solution first | Finds shallowest solution first |
Practical Example: Level-Order Processing
BFS is perfect for processing a tree level by level:
def print_by_levels(root):
"""Print tree nodes grouped by level."""
if root is None:
return
queue = deque([root])
level = 0
while queue:
level_size = len(queue)
print(f"Level {level}: ", end="")
for _ in range(level_size):
node = queue.popleft()
print(node.name, end=" ")
for child in node.children:
queue.append(child)
print() # New line after each level
level += 1
print("\nLevel-order traversal:")
print_by_levels(tree)
Python Output:
Level-order traversal:
Level 0: A
Level 1: B C
Level 2: D E F G
Recursive BFS? Not Really
Can we implement BFS recursively? Not naturally. BFS requires a queue to maintain level order, which doesn't map well to the call stack.
However, we can implement "level-order" recursively by processing one level at a time:
def print_level(node, target_level, current_level=0):
"""Print all nodes at a specific level."""
if node is None:
return
if current_level == target_level:
print(node.name, end=" ")
return
for child in node.children:
print_level(child, target_level, current_level + 1)
def height(node):
"""Calculate tree height."""
if node is None:
return 0
if not node.children:
return 1
return 1 + max(height(child) for child in node.children)
def print_by_levels_recursive(root):
"""Level-order using recursive approach."""
h = height(root)
for level in range(h):
print(f"Level {level}: ", end="")
print_level(root, level)
print()
print("\nRecursive level-order:")
print_by_levels_recursive(tree)
Python Output:
Recursive level-order:
Level 0: A
Level 1: B C
Level 2: D E F G
This works, but it's less efficient (O(nΒ²) vs O(n)) because we traverse the tree once per level.
Complexity Analysis:
Iterative BFS: - Time: O(n) - each node visited once - Space: O(w) - queue holds at most one level (width)
Recursive level-order: - Time: O(n Γ h) - traverse tree h times - Space: O(h) - call stack depth
For most cases, use iterative BFS with a queue.
The Complete Journey: Choosing the Right Traversal
The Complete Journey: Choosing the Right Traversal
The Journey: From Problem to Solution
| Iteration | Problem | Traversal Type | Key Insight | Use Case |
|---|---|---|---|---|
| 0 | Only visit direct children | None (loop only) | Doesn't recurse | β Fails on deep trees |
| 1 | Visit all nodes | Preorder DFS | Process parent before children | File copying, serialization |
| 2 | Need child info first | Postorder DFS | Process children before parent | Directory deletion, aggregation |
| 3 | Need sorted order (BST) | Inorder DFS | Left β Node β Right | BST validation, sorted output |
| 4 | Find nearest match | BFS | Level by level | Shortest path, nearest neighbor |
Decision Framework: Which Traversal When?
START: What do you need to do?
ββ Process ALL nodes?
β ββ Need parent info before children?
β β ββ Use PREORDER (parent β children)
β β Examples: Copy tree, serialize, prefix notation
β β
β ββ Need children info before parent?
β β ββ Use POSTORDER (children β parent)
β β Examples: Delete tree, calculate size, postfix notation
β β
β ββ Binary tree in sorted order?
β ββ Use INORDER (left β node β right)
β Examples: BST validation, sorted output
β
ββ Find SPECIFIC node?
ββ Want nearest/shallowest match?
β ββ Use BFS (level by level)
β Examples: Shortest path, nearest file
β
ββ Exploring all possibilities?
ββ Use DFS (any order)
Examples: Path finding, cycle detection
Final Implementation: Universal Tree Processor
Here's a flexible implementation that supports all traversal types:
from collections import deque
from enum import Enum
class TraversalType(Enum):
PREORDER = "preorder"
POSTORDER = "postorder"
INORDER = "inorder"
BFS = "bfs"
def traverse_tree(node, traversal_type, process_fn):
"""
Universal tree traversal with pluggable processing function.
Args:
node: Root node to start traversal
traversal_type: TraversalType enum value
process_fn: Function to call on each node
"""
if traversal_type == TraversalType.BFS:
# BFS uses queue, not recursion
queue = deque([node])
while queue:
current = queue.popleft()
process_fn(current)
for child in current.children:
queue.append(child)
elif traversal_type == TraversalType.PREORDER:
# Process node first, then children
process_fn(node)
for child in node.children:
traverse_tree(child, traversal_type, process_fn)
elif traversal_type == TraversalType.POSTORDER:
# Process children first, then node
for child in node.children:
traverse_tree(child, traversal_type, process_fn)
process_fn(node)
elif traversal_type == TraversalType.INORDER:
# Only for binary trees: left β node β right
if hasattr(node, 'left') and hasattr(node, 'right'):
if node.left:
traverse_tree(node.left, traversal_type, process_fn)
process_fn(node)
if node.right:
traverse_tree(node.right, traversal_type, process_fn)
else:
raise ValueError("Inorder traversal requires binary tree")
# Test with our file tree
test_tree = FileNode("root", 0, [
FileNode("dir1", 0, [
FileNode("file1.txt", 10),
FileNode("file2.txt", 20)
]),
FileNode("dir2", 0, [
FileNode("file3.txt", 15)
])
])
print("PREORDER (parent first):")
traverse_tree(test_tree, TraversalType.PREORDER,
lambda n: print(f" {n.name}"))
print("\nPOSTORDER (children first):")
traverse_tree(test_tree, TraversalType.POSTORDER,
lambda n: print(f" {n.name}"))
print("\nBFS (level by level):")
traverse_tree(test_tree, TraversalType.BFS,
lambda n: print(f" {n.name}"))
Python Output:
PREORDER (parent first):
root
dir1
file1.txt
file2.txt
dir2
file3.txt
POSTORDER (children first):
file1.txt
file2.txt
dir1
file3.txt
dir2
root
BFS (level by level):
root
dir1
dir2
file1.txt
file2.txt
file3.txt
Complexity Analysis Summary
All traversals visit each node exactly once:
Time Complexity: O(n) for all traversal types - n = number of nodes in tree - Each node processed exactly once
Space Complexity:
| Traversal | Space | Explanation |
|---|---|---|
| Preorder DFS | O(h) | Call stack depth = tree height |
| Postorder DFS | O(h) | Call stack depth = tree height |
| Inorder DFS | O(h) | Call stack depth = tree height |
| BFS | O(w) | Queue holds one level (width) |
Where: - h = height of tree - w = maximum width of tree (nodes at widest level)
Best/Worst Cases:
For a balanced tree: - h = O(log n) - w = O(n/2) at bottom level
For a linear tree (linked list): - h = O(n) - w = O(1)
Trade-off: DFS uses less space for wide, shallow trees. BFS uses less space for narrow, deep trees.
Common Failure Modes and Their Signatures
Symptom: Stack overflow / RecursionError
Python output pattern:
RecursionError: maximum recursion depth exceeded
Diagnostic clues: - Tree is very deep (> 1000 levels in Python) - Using DFS (recursive) traversal
Root cause: Call stack depth exceeds Python's recursion limit
Solution:
1. Use BFS (iterative, no recursion)
2. Increase recursion limit: sys.setrecursionlimit(10000)
3. Convert DFS to iterative using explicit stack
Symptom: Memory error with BFS
Python output pattern:
MemoryError
Diagnostic clues: - Tree is very wide (many nodes at same level) - Using BFS traversal - Queue grows very large
Root cause: Queue holds entire level in memory
Solution: Use DFS instead (trades width for depth)
Symptom: Wrong order for BST operations
Python output pattern:
[5, 3, 7, 1, 4, 9] # Not sorted
Diagnostic clues: - Binary search tree - Using preorder or postorder - Need sorted output
Root cause: Wrong traversal type for BST
Solution: Use inorder traversal for BSTs
Symptom: Finding deep match before shallow match
Python output pattern:
Found: config.json at depth 5
(but there's one at depth 2)
Diagnostic clues: - Using DFS - Need nearest/shallowest match
Root cause: DFS explores one branch completely before others
Solution: Use BFS to find shallowest match first
Lessons Learned
-
Traversal order matters: The order you visit nodes determines what information is available when you process each node.
-
Parent-child dependencies:
- Need parent info? Use preorder
- Need child info? Use postorder
-
Need sorted order (BST)? Use inorder
-
Depth vs breadth:
- DFS: Better for exploring all paths, uses less memory for wide trees
-
BFS: Better for finding nearest match, uses less memory for deep trees
-
Recursion is natural for DFS: The call stack naturally implements depth-first exploration.
-
BFS requires explicit queue: Can't easily implement BFS recursively because we need to process nodes level by level.
-
Binary trees are special: Inorder traversal only makes sense for binary trees where there's a meaningful "middle" child.
-
Choose based on problem requirements: Don't default to one traversalβanalyze what order you need to process nodes.
Practice Problems
Try implementing these using the appropriate traversal:
- Count leaf nodes (nodes with no children)
-
Hint: Any traversal works, but postorder is most natural
-
Find maximum depth of tree
-
Hint: Postorder (need child depths first)
-
Print tree structure with indentation
-
Hint: Preorder (print parent, then indent children)
-
Find all paths from root to leaves
-
Hint: Preorder with path tracking
-
Check if tree is balanced (left and right subtree heights differ by β€ 1)
-
Hint: Postorder (need subtree heights)
-
Find level of a specific node
-
Hint: BFS (tracks level naturally)
-
Serialize tree to string (save to file)
-
Hint: Preorder (parent before children)
-
Deserialize string to tree (load from file)
-
Hint: Preorder (reconstruct parent before children)
-
Find lowest common ancestor of two nodes
-
Hint: Postorder (need subtree information)
-
Convert BST to sorted doubly linked list
- Hint: Inorder (processes in sorted order)